Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 5336 |
| Missing cells | 5140 |
| Missing cells (%) | 6.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 625.4 KiB |
| Average record size in memory | 120.0 B |
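The derived figures in this overview can be cross-checked from the raw counts. The following is a minimal sketch using only the numbers reported above; the variable names are illustrative and not part of the report, and the assumption that 1 KiB = 1024 B is stated in the code.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 5336
missing_cells = 5140
total_size_kib = 625.4

# Total number of cells in the table (rows x columns).
total_cells = n_variables * n_observations

# Share of missing cells, as a percentage of all cells.
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells)                  # 80040
print(round(missing_pct, 1))        # 6.4
print(round(avg_record_bytes, 1))   # 120.0
```

The 5140 missing cells also equal 4 x 1285, which matches four variables each reporting 1285 missing values.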
Variable types
| NUM | 15 |
|---|
Reproduction
| Analysis started | 2021-05-20 19:43:57.163726 |
|---|---|
| Analysis finished | 2021-05-20 19:44:41.739270 |
| Duration | 44.58 seconds |
| Version | pandas-profiling v2.7.1 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| original_matur_years is highly correlated with years_to_matur | High correlation |
| years_to_matur is highly correlated with original_matur_years | High correlation |
| age_loan_years is highly correlated with id | High correlation |
| id is highly correlated with age_loan_years | High correlation |
| NUMBER_OF_FAMILY_MEMBERS has 1285 (24.1%) missing values | Missing |
| FIXED_MONTHLY_EXPENSES has 1285 (24.1%) missing values | Missing |
| INCOME_houshold has 1285 (24.1%) missing values | Missing |
| dpd has 1285 (24.1%) missing values | Missing |
| dpd is highly skewed (γ1 = 23.46348154) | Skewed |
| id is uniformly distributed | Uniform |
| id has unique values | Unique |
| outstanding_volume has unique values | Unique |
| planned_installments has 87 (1.6%) zeros | Zeros |
| prepaid_amount has 371 (7.0%) zeros | Zeros |
| NUMBER_OF_FAMILY_MEMBERS has 2064 (38.7%) zeros | Zeros |
| FIXED_MONTHLY_EXPENSES has 2807 (52.6%) zeros | Zeros |
| INCOME_houshold has 2701 (50.6%) zeros | Zeros |
| dpd has 1258 (23.6%) zeros | Zeros |
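The percentages in the missing-value and zero-value warnings appear to be plain shares of the 5336 observations. The sketch below recomputes them from the reported counts; the `share` helper and the dictionary layout are illustrative assumptions, not part of pandas-profiling.

```python
N_ROWS = 5336  # number of observations reported in the overview


def share(count, total=N_ROWS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts copied from the warnings above.
missing_counts = {"NUMBER_OF_FAMILY_MEMBERS": 1285, "dpd": 1285}
zero_counts = {
    "planned_installments": 87,
    "prepaid_amount": 371,
    "NUMBER_OF_FAMILY_MEMBERS": 2064,
    "FIXED_MONTHLY_EXPENSES": 2807,
    "INCOME_houshold": 2701,
    "dpd": 1258,
}

missing_pct = {name: share(count) for name, count in missing_counts.items()}
zero_pct = {name: share(count) for name, count in zero_counts.items()}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```

Both shares use all rows as the denominator, so a variable such as dpd can report 24.1% missing and 23.6% zeros at the same time.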
id
| Distinct count | 5336 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2675.9913793103447 |
|---|---|
| Minimum | 1 |
| Maximum | 5355 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 267.75 |
| Q1 | 1336.75 |
| median | 2672.5 |
| Q3 | 4015.25 |
| 95-th percentile | 5088.25 |
| Maximum | 5355 |
| Range | 5354 |
| Interquartile range (IQR) | 2678.5 |
Descriptive statistics
| Standard deviation | 1547.065009 |
|---|---|
| Coefficient of variation (CV) | 0.5781277999 |
| Kurtosis | -1.200888697 |
| Mean | 2675.991379 |
| Median Absolute Deviation (MAD) | 1339.5 |
| Skewness | 0.003123542139 |
| Sum | 14279090 |
| Variance | 2393410.141 |
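Several of the statistics above follow directly from others. As a hedged illustration, the sketch below rederives the range, IQR, mean, coefficient of variation and variance from the values reported for this variable (5336 distinct values, minimum 1, maximum 5355, which matches the id column). The inputs are the rounded figures shown in the tables, so small differences in the last digits are expected.

```python
# Values copied from the tables above (rounded as displayed in the report).
minimum, maximum = 1, 5355
q1, q3 = 1336.75, 4015.25
std = 1547.065009
total = 14279090      # "Sum"
n = 5336              # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean = total / n                  # Mean = Sum / count
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance = standard deviation squared

print(value_range, iqr)
print(round(mean, 6), round(cv, 7), round(variance, 2))
```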
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 585 | 1 | < 0.1% | |
| 2636 | 1 | < 0.1% | |
| 589 | 1 | < 0.1% | |
| 4687 | 1 | < 0.1% | |
| 2640 | 1 | < 0.1% | |
| 593 | 1 | < 0.1% | |
| 4691 | 1 | < 0.1% | |
| 2644 | 1 | < 0.1% | |
| 597 | 1 | < 0.1% | |
| Other values (5326) | 5326 | 99.8% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5355 | 1 | < 0.1% | |
| 5354 | 1 | < 0.1% | |
| 5353 | 1 | < 0.1% | |
| 5352 | 1 | < 0.1% | |
| 5351 | 1 | < 0.1% |
date_str
Real number (ℝ≥0)
| Distinct count | 287 |
|---|---|
| Unique (%) | 5.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20173700.114055436 |
|---|---|
| Minimum | 20160131.0 |
| Maximum | 20192740.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 20160131 |
|---|---|
| 5-th percentile | 20162037.5 |
| Q1 | 20167766.11 |
| median | 20172184.26 |
| Q3 | 20176660.42 |
| 95-th percentile | 20192184.15 |
| Maximum | 20192740.5 |
| Range | 32609.5 |
| Interquartile range (IQR) | 8894.312857 |
Descriptive statistics
| Standard deviation | 8580.898157 |
|---|---|
| Coefficient of variation (CV) | 0.0004253507343 |
| Kurtosis | -0.07460327278 |
| Mean | 20173700.11 |
| Median Absolute Deviation (MAD) | 4476.16359 |
| Skewness | 0.8084568429 |
| Sum | 1.076468638e+11 |
| Variance | 73631813.18 |
| Value | Count | Frequency (%) | |
| 20176660.42 | 471 | 8.8% | |
| 20172184.26 | 143 | 2.7% | |
| 20192347.17 | 140 | 2.6% | |
| 20171706.71 | 134 | 2.5% | |
| 20169011.06 | 125 | 2.3% | |
| 20170062.79 | 123 | 2.3% | |
| 20168620.4 | 123 | 2.3% | |
| 20167297.07 | 118 | 2.2% | |
| 20167766.11 | 117 | 2.2% | |
| 20170379 | 116 | 2.2% | |
| Other values (277) | 3726 | 69.8% |
| Value | Count | Frequency (%) | |
| 20160131 | 6 | 0.1% | |
| 20160180 | 13 | 0.2% | |
| 20160230.33 | 7 | 0.1% | |
| 20160280.25 | 11 | 0.2% | |
| 20160330.4 | 18 | 0.3% |
| Value | Count | Frequency (%) | |
| 20192740.5 | 14 | 0.3% | |
| 20192530.45 | 59 | 1.1% | |
| 20192347.17 | 140 | 2.6% | |
| 20192184.15 | 94 | 1.8% | |
| 20192037.5 | 49 | 0.9% |
years_to_matur
| Distinct count | 5174 |
|---|---|
| Unique (%) | 97.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.49482295484015 |
|---|---|
| Minimum | 0.2225 |
| Maximum | 34.92460000000001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0.2225 |
|---|---|
| 5-th percentile | 1.366416667 |
| Q1 | 2.725819444 |
| median | 6.628365385 |
| Q3 | 14.78214489 |
| 95-th percentile | 25.2825931 |
| Maximum | 34.9246 |
| Range | 34.7021 |
| Interquartile range (IQR) | 12.05632544 |
Descriptive statistics
| Standard deviation | 8.189771221 |
|---|---|
| Coefficient of variation (CV) | 0.8625512303 |
| Kurtosis | -0.300599814 |
| Mean | 9.494822955 |
| Median Absolute Deviation (MAD) | 4.460735755 |
| Skewness | 0.9401670615 |
| Sum | 50664.37529 |
| Variance | 67.07235265 |
| Value | Count | Frequency (%) | |
| 19.51 | 6 | 0.1% | |
| 19.26444444 | 5 | 0.1% | |
| 19.346875 | 5 | 0.1% | |
| 9.298235294 | 4 | 0.1% | |
| 29.50583333 | 4 | 0.1% | |
| 9.47 | 4 | 0.1% | |
| 9.258888889 | 3 | 0.1% | |
| 23.14288889 | 3 | 0.1% | |
| 1.723548387 | 3 | 0.1% | |
| 9.505833333 | 3 | 0.1% | |
| Other values (5164) | 5296 | 99.3% |
| Value | Count | Frequency (%) | |
| 0.2225 | 1 | < 0.1% | |
| 0.252 | 1 | < 0.1% | |
| 0.2766666667 | 1 | < 0.1% | |
| 0.335 | 1 | < 0.1% | |
| 0.35 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 34.9246 | 1 | < 0.1% | |
| 34.91166667 | 1 | < 0.1% | |
| 34.53583333 | 1 | < 0.1% | |
| 34.21789474 | 1 | < 0.1% | |
| 34.1508 | 1 | < 0.1% |
age_owner_years
Real number (ℝ≥0)
| Distinct count | 3267 |
|---|---|
| Unique (%) | 61.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 46.766936024571656 |
|---|---|
| Minimum | 22.270000000000003 |
| Maximum | 74.60235294117648 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 22.27 |
|---|---|
| 5-th percentile | 32.40192513 |
| Q1 | 41.72976364 |
| median | 49.71574176 |
| Q3 | 50.3825 |
| 95-th percentile | 59.11730691 |
| Maximum | 74.60235294 |
| Range | 52.33235294 |
| Interquartile range (IQR) | 8.652736364 |
Descriptive statistics
| Standard deviation | 7.690237373 |
|---|---|
| Coefficient of variation (CV) | 0.1644374857 |
| Kurtosis | 0.3995049449 |
| Mean | 46.76693602 |
| Median Absolute Deviation (MAD) | 1.334883242 |
| Skewness | -0.3508125672 |
| Sum | 249548.3706 |
| Variance | 59.13975086 |
| Value | Count | Frequency (%) | |
| 49.8825 | 33 | 0.6% | |
| 50.80026316 | 25 | 0.5% | |
| 49.549 | 25 | 0.5% | |
| 49.50741935 | 23 | 0.4% | |
| 50.59186047 | 21 | 0.4% | |
| 49.67407407 | 21 | 0.4% | |
| 50.46673913 | 21 | 0.4% | |
| 50.75871795 | 20 | 0.4% | |
| 49.79916667 | 20 | 0.4% | |
| 50.67536585 | 18 | 0.3% | |
| Other values (3257) | 5109 | 95.7% |
| Value | Count | Frequency (%) | |
| 22.27 | 1 | < 0.1% | |
| 23.76461538 | 1 | < 0.1% | |
| 23.94785714 | 1 | < 0.1% | |
| 25.08571429 | 1 | < 0.1% | |
| 25.19363636 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 74.60235294 | 1 | < 0.1% | |
| 73.78386364 | 1 | < 0.1% | |
| 73.06166667 | 1 | < 0.1% | |
| 71.80268293 | 1 | < 0.1% | |
| 71.47756098 | 1 | < 0.1% |
original_matur_years
| Distinct count | 2340 |
|---|---|
| Unique (%) | 43.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.71496064467766 |
|---|---|
| Minimum | 2.4600000000000004 |
| Maximum | 41.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 2.46 |
|---|---|
| 5-th percentile | 3.03 |
| Q1 | 5.02 |
| median | 10.05 |
| Q3 | 20.11 |
| 95-th percentile | 30.5625 |
| Maximum | 41.99 |
| Range | 39.53 |
| Interquartile range (IQR) | 15.09 |
Descriptive statistics
| Standard deviation | 9.236201394 |
|---|---|
| Coefficient of variation (CV) | 0.6734398759 |
| Kurtosis | -0.6812482264 |
| Mean | 13.71496064 |
| Median Absolute Deviation (MAD) | 5.07 |
| Skewness | 0.7185967562 |
| Sum | 73183.03 |
| Variance | 85.30741619 |
| Value | Count | Frequency (%) | |
| 5 | 128 | 2.4% | |
| 5.03 | 98 | 1.8% | |
| 4.97 | 80 | 1.5% | |
| 3 | 79 | 1.5% | |
| 10 | 61 | 1.1% | |
| 10.01 | 55 | 1.0% | |
| 4 | 38 | 0.7% | |
| 5.01 | 38 | 0.7% | |
| 5.01 | 37 | 0.7% | |
| 10.01 | 35 | 0.7% | |
| Other values (2330) | 4687 | 87.8% |
| Value | Count | Frequency (%) | |
| 2.46 | 1 | < 0.1% | |
| 2.72 | 1 | < 0.1% | |
| 2.9 | 1 | < 0.1% | |
| 2.96 | 1 | < 0.1% | |
| 2.96 | 5 | 0.1% |
| Value | Count | Frequency (%) | |
| 41.99 | 1 | < 0.1% | |
| 41.96 | 1 | < 0.1% | |
| 41.41 | 1 | < 0.1% | |
| 40.61 | 1 | < 0.1% | |
| 40.5 | 1 | < 0.1% |
client_rate
Real number (ℝ≥0)
| Distinct count | 1312 |
|---|---|
| Unique (%) | 24.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.048790161169415286 |
|---|---|
| Minimum | 0.02059999999999998 |
| Maximum | 0.09800000000000003 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0.0206 |
|---|---|
| 5-th percentile | 0.0269 |
| Q1 | 0.0357 |
| median | 0.0442 |
| Q3 | 0.0579 |
| 95-th percentile | 0.0879 |
| Maximum | 0.098 |
| Range | 0.0774 |
| Interquartile range (IQR) | 0.0222 |
Descriptive statistics
| Standard deviation | 0.01775843674 |
|---|---|
| Coefficient of variation (CV) | 0.3639757754 |
| Kurtosis | 0.5122280985 |
| Mean | 0.04879016117 |
| Median Absolute Deviation (MAD) | 0.0116 |
| Skewness | 0.9443781656 |
| Sum | 260.3443 |
| Variance | 0.0003153620756 |
| Value | Count | Frequency (%) | |
| 0.0579 | 154 | 2.9% | |
| 0.068 | 145 | 2.7% | |
| 0.073 | 94 | 1.8% | |
| 0.0579 | 59 | 1.1% | |
| 0.0579 | 53 | 1.0% | |
| 0.0529 | 51 | 1.0% | |
| 0.0344 | 48 | 0.9% | |
| 0.068 | 47 | 0.9% | |
| 0.0579 | 45 | 0.8% | |
| 0.0579 | 43 | 0.8% | |
| Other values (1302) | 4597 | 86.2% |
| Value | Count | Frequency (%) | |
| 0.0206 | 2 | < 0.1% | |
| 0.0207 | 1 | < 0.1% | |
| 0.0207 | 4 | 0.1% | |
| 0.0208 | 5 | 0.1% | |
| 0.021 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0.098 | 17 | 0.3% | |
| 0.098 | 14 | 0.3% | |
| 0.098 | 12 | 0.2% | |
| 0.098 | 34 | 0.6% | |
| 0.098 | 25 | 0.5% |
original_volume
Real number (ℝ≥0)
| Distinct count | 2100 |
|---|---|
| Unique (%) | 39.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 165086.50153121963 |
|---|---|
| Minimum | 1037.0 |
| Maximum | 3060751.590000002 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 1037 |
|---|---|
| 5-th percentile | 13481 |
| Q1 | 51850 |
| median | 114070 |
| Q3 | 213259.05 |
| 95-th percentile | 497760 |
| Maximum | 3060751.59 |
| Range | 3059714.59 |
| Interquartile range (IQR) | 161409.05 |
Descriptive statistics
| Standard deviation | 172615.0968 |
|---|---|
| Coefficient of variation (CV) | 1.045603942 |
| Kurtosis | 25.96738691 |
| Mean | 165086.5015 |
| Median Absolute Deviation (MAD) | 72590 |
| Skewness | 3.344493297 |
| Sum | 880901572.2 |
| Variance | 2.979597166e+10 |
| Value | Count | Frequency (%) | |
| 103700 | 258 | 4.8% | |
| 155550 | 134 | 2.5% | |
| 51850 | 130 | 2.4% | |
| 207400 | 128 | 2.4% | |
| 72590 | 103 | 1.9% | |
| 82960 | 99 | 1.9% | |
| 124440 | 93 | 1.7% | |
| 62220 | 85 | 1.6% | |
| 41480 | 75 | 1.4% | |
| 20740 | 75 | 1.4% | |
| Other values (2090) | 4156 | 77.9% |
| Value | Count | Frequency (%) | |
| 1037 | 1 | < 0.1% | |
| 2696.2 | 1 | < 0.1% | |
| 3836.9 | 1 | < 0.1% | |
| 4148 | 3 | 0.1% | |
| 4251.7 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 3060751.59 | 1 | < 0.1% | |
| 2074000 | 1 | < 0.1% | |
| 1742160 | 1 | < 0.1% | |
| 1659200 | 1 | < 0.1% | |
| 1555500 | 2 | < 0.1% |
age_loan_years
| Distinct count | 4887 |
|---|---|
| Unique (%) | 91.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.220157159194571 |
|---|---|
| Minimum | 0.01 |
| Maximum | 16.12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.5623076923 |
| Q1 | 1.825432331 |
| median | 3.406742424 |
| Q3 | 6.462611434 |
| 95-th percentile | 9.479347826 |
| Maximum | 16.12 |
| Range | 16.11 |
| Interquartile range (IQR) | 4.637179103 |
Descriptive statistics
| Standard deviation | 2.916417291 |
|---|---|
| Coefficient of variation (CV) | 0.6910684084 |
| Kurtosis | -0.3364997565 |
| Mean | 4.220157159 |
| Median Absolute Deviation (MAD) | 2.040274621 |
| Skewness | 0.6713532068 |
| Sum | 22518.7586 |
| Variance | 8.505489816 |
| Value | Count | Frequency (%) | |
| 0.5276923077 | 8 | 0.1% | |
| 0.5576923077 | 8 | 0.1% | |
| 0.7252941176 | 8 | 0.1% | |
| 0.5492307692 | 8 | 0.1% | |
| 0.5669230769 | 8 | 0.1% | |
| 0.5066666667 | 7 | 0.1% | |
| 0.4758333333 | 7 | 0.1% | |
| 0.5776923077 | 7 | 0.1% | |
| 0.49 | 7 | 0.1% | |
| 0.4941666667 | 7 | 0.1% | |
| Other values (4877) | 5261 | 98.6% |
| Value | Count | Frequency (%) | |
| 0.01 | 1 | < 0.1% | |
| 0.09 | 1 | < 0.1% | |
| 0.1 | 2 | < 0.1% | |
| 0.1 | 1 | < 0.1% | |
| 0.1266666667 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 16.12 | 1 | < 0.1% | |
| 15.76727273 | 1 | < 0.1% | |
| 15.19170732 | 1 | < 0.1% | |
| 14.49342857 | 1 | < 0.1% | |
| 14.43 | 1 | < 0.1% |
outstanding_volume
| Distinct count | 5336 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101980.55702776244 |
|---|---|
| Minimum | 391.9171428571429 |
| Maximum | 1666664.0092000002 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 391.9171429 |
|---|---|
| 5-th percentile | 5775.311763 |
| Q1 | 24140.1645 |
| median | 63205.05775 |
| Q3 | 137106.7146 |
| 95-th percentile | 325535.9448 |
| Maximum | 1666664.009 |
| Range | 1666272.092 |
| Interquartile range (IQR) | 112966.5501 |
Descriptive statistics
| Standard deviation | 117990.9137 |
|---|---|
| Coefficient of variation (CV) | 1.156994207 |
| Kurtosis | 17.01850308 |
| Mean | 101980.557 |
| Median Absolute Deviation (MAD) | 47050.52624 |
| Skewness | 3.003482294 |
| Sum | 544168252.3 |
| Variance | 1.392185573e+10 |
| Value | Count | Frequency (%) | |
| 155331.6133 | 1 | < 0.1% | |
| 91924.08429 | 1 | < 0.1% | |
| 35419.98368 | 1 | < 0.1% | |
| 383211.6678 | 1 | < 0.1% | |
| 227670.3665 | 1 | < 0.1% | |
| 38210.8505 | 1 | < 0.1% | |
| 223965.9823 | 1 | < 0.1% | |
| 391.9171429 | 1 | < 0.1% | |
| 289171.1658 | 1 | < 0.1% | |
| 45975.8552 | 1 | < 0.1% | |
| Other values (5326) | 5326 | 99.8% |
| Value | Count | Frequency (%) | |
| 391.9171429 | 1 | < 0.1% | |
| 974.498 | 1 | < 0.1% | |
| 983.2952941 | 1 | < 0.1% | |
| 1033.65 | 1 | < 0.1% | |
| 1037 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1666664.009 | 1 | < 0.1% | |
| 1280634.131 | 1 | < 0.1% | |
| 1168332.368 | 1 | < 0.1% | |
| 1155583.71 | 1 | < 0.1% | |
| 1083240.647 | 1 | < 0.1% |
planned_installments
| Distinct count | 5242 |
|---|---|
| Unique (%) | 98.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 826.6141932608041 |
|---|---|
| Minimum | 0.0 |
| Maximum | 14381.5572972973 |
| Zeros | 87 |
| Zeros (%) | 1.6% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 119.8299653 |
| Q1 | 318.4323333 |
| median | 568.4001935 |
| Q3 | 992.8168482 |
| 95-th percentile | 2339.158936 |
| Maximum | 14381.5573 |
| Range | 14381.5573 |
| Interquartile range (IQR) | 674.3845149 |
Descriptive statistics
| Standard deviation | 937.8308729 |
|---|---|
| Coefficient of variation (CV) | 1.134544846 |
| Kurtosis | 39.36086509 |
| Mean | 826.6141933 |
| Median Absolute Deviation (MAD) | 301.5206234 |
| Skewness | 4.792225208 |
| Sum | 4410813.335 |
| Variance | 879526.7462 |
| Value | Count | Frequency (%) | |
| 0 | 87 | 1.6% | |
| 864.16 | 5 | 0.1% | |
| 1728.34 | 3 | 0.1% | |
| 432.09 | 2 | < 0.1% | |
| 1296.25 | 2 | < 0.1% | |
| 134.5344 | 1 | < 0.1% | |
| 107.2725 | 1 | < 0.1% | |
| 1578.916538 | 1 | < 0.1% | |
| 625.6085 | 1 | < 0.1% | |
| 424.1090476 | 1 | < 0.1% | |
| Other values (5232) | 5232 | 98.1% |
| Value | Count | Frequency (%) | |
| 0 | 87 | 1.6% | |
| 12.9044 | 1 | < 0.1% | |
| 14.85857143 | 1 | < 0.1% | |
| 15.63 | 1 | < 0.1% | |
| 26.57666667 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 14381.5573 | 1 | < 0.1% | |
| 12836.15529 | 1 | < 0.1% | |
| 11693.30222 | 1 | < 0.1% | |
| 11422.21727 | 1 | < 0.1% | |
| 10909.01757 | 1 | < 0.1% |
prepaid_amount
| Distinct count | 4605 |
|---|---|
| Unique (%) | 86.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2305.9657315674654 |
|---|---|
| Minimum | 0.0 |
| Maximum | 204997.79 |
| Zeros | 371 |
| Zeros (%) | 7.0% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 230.4444444 |
| median | 821.7166327 |
| Q3 | 2357.514952 |
| 95-th percentile | 7692.518359 |
| Maximum | 204997.79 |
| Range | 204997.79 |
| Interquartile range (IQR) | 2127.070508 |
Descriptive statistics
| Standard deviation | 7089.313809 |
|---|---|
| Coefficient of variation (CV) | 3.07433615 |
| Kurtosis | 366.9873698 |
| Mean | 2305.965732 |
| Median Absolute Deviation (MAD) | 729.1215486 |
| Skewness | 16.3062481 |
| Sum | 12304633.14 |
| Variance | 50258370.29 |
| Value | Count | Frequency (%) | |
| 0 | 371 | 7.0% | |
| 414.8 | 28 | 0.5% | |
| 1037 | 14 | 0.3% | |
| 622.2 | 12 | 0.2% | |
| 1244.4 | 12 | 0.2% | |
| 518.5 | 11 | 0.2% | |
| 207.4 | 10 | 0.2% | |
| 829.6 | 10 | 0.2% | |
| 311.1 | 10 | 0.2% | |
| 2074 | 10 | 0.2% | |
| Other values (4595) | 4848 | 90.9% |
| Value | Count | Frequency (%) | |
| 0 | 371 | 7.0% | |
| 2.728947368 | 1 | < 0.1% | |
| 3.013846154 | 1 | < 0.1% | |
| 3.988461538 | 2 | < 0.1% | |
| 4.320833333 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 204997.79 | 1 | < 0.1% | |
| 188878.56 | 1 | < 0.1% | |
| 172237.7067 | 1 | < 0.1% | |
| 162809 | 1 | < 0.1% | |
| 114417.502 | 1 | < 0.1% |
NUMBER_OF_FAMILY_MEMBERS
| Distinct count | 9 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 1285 |
| Missing (%) | 24.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.2001974821031844 |
|---|---|
| Minimum | 0.0 |
| Maximum | 50.0 |
| Zeros | 2064 |
| Zeros (%) | 38.7% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.812892092 |
|---|---|
| Coefficient of variation (CV) | 1.510494831 |
| Kurtosis | 257.8625276 |
| Mean | 1.200197482 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 10.11923558 |
| Sum | 4862 |
| Variance | 3.286577739 |
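For variables with missing values, the reported figures are consistent with two different denominators: the mean and Unique (%) seem to be taken over non-missing observations only, while Zeros (%) and Missing (%) are taken over all rows. The sketch below checks this against the numbers above for the variable with 1285 missing values and 2064 zeros (NUMBER_OF_FAMILY_MEMBERS in the warnings). This is an inference from the reported values, not a statement about the library's internals.

```python
# Figures copied from the tables above.
n_rows = 5336        # all observations
n_missing = 1285     # "Missing"
n_distinct = 9       # "Distinct count"
n_zeros = 2064       # "Zeros"
total = 4862         # "Sum"

n_present = n_rows - n_missing             # non-missing observations

mean = total / n_present                   # matches the reported mean
unique_pct = 100 * n_distinct / n_present  # matches "Unique (%)"
zeros_pct = 100 * n_zeros / n_rows         # matches "Zeros (%)"
missing_pct = 100 * n_missing / n_rows     # matches "Missing (%)"

print(n_present)                 # 4051
print(round(mean, 6))            # 1.200197
print(round(unique_pct, 1), round(zeros_pct, 1), round(missing_pct, 1))
```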
| Value | Count | Frequency (%) | |
| 0 | 2064 | 38.7% | |
| 2 | 675 | 12.6% | |
| 1 | 509 | 9.5% | |
| 3 | 390 | 7.3% | |
| 4 | 338 | 6.3% | |
| 5 | 63 | 1.2% | |
| 6 | 7 | 0.1% | |
| 8 | 3 | 0.1% | |
| 50 | 2 | < 0.1% | |
| (Missing) | 1285 | 24.1% |
| Value | Count | Frequency (%) | |
| 0 | 2064 | 38.7% | |
| 1 | 509 | 9.5% | |
| 2 | 675 | 12.6% | |
| 3 | 390 | 7.3% | |
| 4 | 338 | 6.3% |
| Value | Count | Frequency (%) | |
| 50 | 2 | < 0.1% | |
| 8 | 3 | 0.1% | |
| 6 | 7 | 0.1% | |
| 5 | 63 | 1.2% | |
| 4 | 338 | 6.3% |
FIXED_MONTHLY_EXPENSES
| Distinct count | 150 |
|---|---|
| Unique (%) | 3.7% |
| Missing | 1285 |
| Missing (%) | 24.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 365.76509898790425 |
|---|---|
| Minimum | 0.0 |
| Maximum | 11700.0 |
| Zeros | 2807 |
| Zeros (%) | 52.6% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 390 |
| 95-th percentile | 1790.3795 |
| Maximum | 11700 |
| Range | 11700 |
| Interquartile range (IQR) | 390 |
Descriptive statistics
| Standard deviation | 775.3740087 |
|---|---|
| Coefficient of variation (CV) | 2.119868765 |
| Kurtosis | 26.03086803 |
| Mean | 365.765099 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.820677667 |
| Sum | 1481714.416 |
| Variance | 601204.8534 |
| Value | Count | Frequency (%) | |
| 0 | 2807 | 52.6% | |
| 650 | 172 | 3.2% | |
| 1300 | 143 | 2.7% | |
| 390 | 89 | 1.7% | |
| 780 | 84 | 1.6% | |
| 1040 | 66 | 1.2% | |
| 1950 | 58 | 1.1% | |
| 1560 | 57 | 1.1% | |
| 130 | 55 | 1.0% | |
| 2600 | 51 | 1.0% | |
| Other values (140) | 469 | 8.8% | |
| (Missing) | 1285 | 24.1% |
| Value | Count | Frequency (%) | |
| 0 | 2807 | 52.6% | |
| 13 | 1 | < 0.1% | |
| 26 | 1 | < 0.1% | |
| 65 | 21 | 0.4% | |
| 78 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 11700 | 1 | < 0.1% | |
| 7800 | 2 | < 0.1% | |
| 6760 | 1 | < 0.1% | |
| 6500 | 6 | 0.1% | |
| 5443.88 | 1 | < 0.1% |
INCOME_houshold
| Distinct count | 1264 |
|---|---|
| Unique (%) | 31.2% |
| Missing | 1285 |
| Missing (%) | 24.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4423.720887682054 |
|---|---|
| Minimum | -28130.388000000014 |
| Maximum | 335624.6520000002 |
| Zeros | 2701 |
| Zeros (%) | 50.6% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | -28130.388 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 4345.614 |
| 95-th percentile | 20726.04 |
| Maximum | 335624.652 |
| Range | 363755.04 |
| Interquartile range (IQR) | 4345.614 |
Descriptive statistics
| Standard deviation | 12373.08711 |
|---|---|
| Coefficient of variation (CV) | 2.796986389 |
| Kurtosis | 153.4802834 |
| Mean | 4423.720888 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 8.738979445 |
| Sum | 17920493.32 |
| Variance | 153093284.7 |
| Value | Count | Frequency (%) | |
| 0 | 2701 | 50.6% | |
| 3000 | 7 | 0.1% | |
| 7200 | 6 | 0.1% | |
| 3600 | 5 | 0.1% | |
| 6960 | 5 | 0.1% | |
| 10800 | 4 | 0.1% | |
| 2640 | 4 | 0.1% | |
| 8400 | 4 | 0.1% | |
| 2520 | 4 | 0.1% | |
| 7800 | 3 | 0.1% | |
| Other values (1254) | 1308 | 24.5% | |
| (Missing) | 1285 | 24.1% |
| Value | Count | Frequency (%) | |
| -28130.388 | 1 | < 0.1% | |
| -3530.28 | 1 | < 0.1% | |
| -18.948 | 1 | < 0.1% | |
| 0 | 2701 | 50.6% | |
| 720 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 335624.652 | 1 | < 0.1% | |
| 139273.416 | 1 | < 0.1% | |
| 134381.7 | 1 | < 0.1% | |
| 134381.7 | 1 | < 0.1% | |
| 133655.664 | 2 | < 0.1% |
dpd
| Distinct count | 921 |
|---|---|
| Unique (%) | 22.7% |
| Missing | 1285 |
| Missing (%) | 24.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.084392084942053 |
|---|---|
| Minimum | 0.0 |
| Maximum | 295.6111111111111 |
| Zeros | 1258 |
| Zeros (%) | 23.6% |
| Memory size | 41.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.0625 |
| Q3 | 0.2989361702 |
| 95-th percentile | 3.411538462 |
| Maximum | 295.6111111 |
| Range | 295.6111111 |
| Interquartile range (IQR) | 0.2989361702 |
Descriptive statistics
| Standard deviation | 8.972407119 |
|---|---|
| Coefficient of variation (CV) | 8.274135568 |
| Kurtosis | 640.8029831 |
| Mean | 1.084392085 |
| Median Absolute Deviation (MAD) | 0.0625 |
| Skewness | 23.46348154 |
| Sum | 4392.872336 |
| Variance | 80.50408952 |
| Value | Count | Frequency (%) | |
| 0 | 1258 | 23.6% | |
| 0.02 | 92 | 1.7% | |
| 0.08333333333 | 80 | 1.5% | |
| 0.07692307692 | 61 | 1.1% | |
| 0.07142857143 | 52 | 1.0% | |
| 0.09090909091 | 51 | 1.0% | |
| 0.0625 | 46 | 0.9% | |
| 0.05555555556 | 36 | 0.7% | |
| 0.1 | 36 | 0.7% | |
| 0.05882352941 | 36 | 0.7% | |
| Other values (911) | 2303 | 43.2% | |
| (Missing) | 1285 | 24.1% |
| Value | Count | Frequency (%) | |
| 0 | 1258 | 23.6% | |
| 0.02 | 92 | 1.7% | |
| 0.02040816327 | 14 | 0.3% | |
| 0.02083333333 | 7 | 0.1% | |
| 0.02127659574 | 11 | 0.2% |
| Value | Count | Frequency (%) | |
| 295.6111111 | 1 | < 0.1% | |
| 261.5365854 | 1 | < 0.1% | |
| 230.8064516 | 1 | < 0.1% | |
| 180.0540541 | 1 | < 0.1% | |
| 163.3902439 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
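The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions; the report itself relies on library routines rather than this code.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so sums of deviations are used directly.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])      # increasing linear relation
r_down = pearson_r(x, [-3 * v + 10 for v in x])  # decreasing linear relation
print(r_up, r_down)
```

The two relations have different slopes and offsets, yet both give |r| = 1, which illustrates the invariance under changes of location and scale.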
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
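As a rough sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names below are hypothetical, and giving tied values the average of their positions is an assumption of this sketch (it is a common convention).

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # nonlinear but strictly increasing
print(spearman_rho(x, y), pearson_r(x, y))
```

For the cubic relation, ρ is 1 because the ordering is preserved exactly, while Pearson's r on the raw values falls below 1.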
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
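The pair-counting definition above corresponds to the variant often called tau-a. The sketch below implements exactly that definition; the function name is illustrative, and library implementations commonly use a tie-corrected variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A positive product means both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.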
Phik (φk)
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | id | date_str | years_to_matur | age_owner_years | original_matur_years | client_rate | original_volume | age_loan_years | outstanding_volume | planned_installments | prepaid_amount | NUMBER_OF_FAMILY_MEMBERS | FIXED_MONTHLY_EXPENSES | INCOME_houshold | dpd |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2.018020e+07 | 9.821220 | 45.883171 | 25.01 | 0.0342 | 51850.00 | 15.191707 | 29288.497805 | 209.631707 | 0.000000 | 4.0 | 1040.000 | 0.000 | 0.146341 |
| 1 | 2 | 2.017068e+07 | 5.809167 | 57.493333 | 21.93 | 0.0342 | 155550.00 | 16.120000 | 75720.080833 | 634.615000 | 1907.967222 | 0.0 | 0.000 | 0.000 | 0.027778 |
| 2 | 3 | 2.016518e+07 | 9.249091 | 44.113182 | 25.02 | 0.0432 | 70516.00 | 15.767273 | 39353.626364 | 287.257727 | 1636.945455 | 3.0 | 1950.000 | 19200.000 | 0.000000 |
| 3 | 4 | 2.017218e+07 | 5.580769 | 41.031795 | 20.01 | 0.0850 | 31110.00 | 14.430000 | 14291.302821 | 165.646667 | 277.665897 | 2.0 | 260.000 | 2508.432 | 0.641026 |
| 4 | 5 | 2.017038e+07 | 4.773429 | 55.784857 | 19.27 | 0.0279 | 336431.43 | 14.493429 | 31272.502857 | 505.280857 | 630.223143 | NaN | NaN | NaN | NaN |
| 5 | 6 | 2.017428e+07 | 11.925909 | 43.415000 | 25.64 | 0.0279 | 72590.00 | 13.717500 | 30239.141591 | 210.730000 | 579.494091 | 2.0 | 1757.418 | 8111.148 | 0.000000 |
| 6 | 7 | 2.017535e+07 | 12.185957 | 40.881702 | 25.51 | 0.0279 | 41480.00 | 13.323191 | 20964.954894 | 142.610000 | 373.241702 | NaN | NaN | NaN | NaN |
| 7 | 8 | 2.016973e+07 | 11.075758 | 48.151515 | 24.58 | 0.0279 | 74041.80 | 13.499091 | 34384.615758 | 257.080000 | 903.744848 | NaN | NaN | NaN | NaN |
| 8 | 9 | 2.017666e+07 | 4.555800 | 65.900600 | 18.48 | 0.0260 | 125341.39 | 13.929800 | 37794.061400 | 594.411800 | 510.826200 | NaN | NaN | NaN | NaN |
| 9 | 10 | 2.018305e+07 | 7.864118 | 48.156471 | 20.06 | 0.0259 | 285175.00 | 12.198824 | 104801.444118 | 1109.020000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.264706 |
Last rows
| | id | date_str | years_to_matur | age_owner_years | original_matur_years | client_rate | original_volume | age_loan_years | outstanding_volume | planned_installments | prepaid_amount | NUMBER_OF_FAMILY_MEMBERS | FIXED_MONTHLY_EXPENSES | INCOME_houshold | dpd |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5326 | 5346 | 2.019253e+07 | 19.194545 | 45.975455 | 19.70 | 0.0367 | 350292.071818 | 0.505455 | 337311.960000 | 0.000000 | 0.000000 | 5.0 | 2080.000 | 0.000 | 0.090909 |
| 5327 | 5347 | 2.019253e+07 | 25.397273 | 35.163636 | 25.90 | 0.0359 | 195370.800000 | 0.505455 | 195370.800000 | 0.000000 | 0.000000 | 4.0 | 1560.000 | 0.000 | 0.000000 |
| 5328 | 5348 | 2.019253e+07 | 24.754545 | 35.883636 | 25.25 | 0.0372 | 131510.454545 | 0.500000 | 124345.727273 | 0.000000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.454545 |
| 5329 | 5349 | 2.019253e+07 | 26.726364 | 30.455455 | 27.22 | 0.0371 | 248880.000000 | 0.500000 | 247251.742727 | 356.259091 | 0.000000 | 1.0 | 780.000 | 2400.000 | 0.000000 |
| 5330 | 5350 | 2.019274e+07 | 13.901000 | 39.626000 | 14.44 | 0.0356 | 223676.302000 | 0.542000 | 213838.411000 | 314.523000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.100000 |
| 5331 | 5351 | 2.019253e+07 | 19.859091 | 38.545455 | 20.36 | 0.0362 | 227106.770909 | 0.500000 | 216174.031818 | 482.816364 | 424.227273 | 4.0 | 780.000 | 0.000 | 0.090909 |
| 5332 | 5352 | 2.019274e+07 | 19.594000 | 30.414000 | 20.13 | 0.0356 | 207400.000000 | 0.542000 | 204755.477000 | 589.554000 | 69.431000 | 2.0 | 1755.000 | 6338.736 | 0.000000 |
| 5333 | 5353 | 2.019253e+07 | 9.938182 | 35.255455 | 10.44 | 0.0373 | 79189.090909 | 0.500000 | 71186.596364 | 386.250000 | 433.654545 | 1.0 | 1300.000 | 0.000 | 0.090909 |
| 5334 | 5354 | 2.019253e+07 | 29.644545 | 29.554545 | 30.14 | 0.0356 | 362950.000000 | 0.500000 | 360658.036364 | 517.690000 | 0.000000 | 2.0 | 1574.417 | 5382.612 | 0.000000 |
| 5335 | 5355 | 2.019253e+07 | 9.560909 | 37.079091 | 10.06 | 0.0369 | 186660.000000 | 0.495455 | 174494.760000 | 1249.769091 | 787.611818 | 1.0 | 1950.000 | 0.000 | 0.000000 |